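Analysis process
1. Assign points based on six conditions and extract the listings that score 3 points or more as suspected fake listings.
2. Use a regression model to extract suspected fake listings based on a range of conditions.
3. From the two results, extract the 7 fake listings common to both.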
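Step 1 can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the six conditions are not listed here, so each listing is represented only by six True/False outcomes, and the sketch assumes that every satisfied condition is worth one point. The names `score_listing` and `flag_by_score` and the listing IDs are hypothetical.

```python
SCORE_THRESHOLD = 3  # from the text: 3 points or more counts as suspicious
N_CONDITIONS = 6     # from the text: six conditions


def score_listing(condition_hits):
    """Return one point per satisfied condition (assumed weighting)."""
    if len(condition_hits) != N_CONDITIONS:
        raise ValueError(f"expected {N_CONDITIONS} condition outcomes")
    return sum(bool(hit) for hit in condition_hits)


def flag_by_score(listings):
    """Return the sorted IDs of listings that reach the score threshold."""
    return sorted(
        listing_id
        for listing_id, hits in listings.items()
        if score_listing(hits) >= SCORE_THRESHOLD
    )


# Made-up listings: ID -> outcome of each of the six conditions.
listings = {
    101: [True, True, True, False, False, False],
    102: [True, False, False, False, False, False],
    103: [True, True, True, True, False, True],
    104: [False, True, True, False, False, False],
}

flagged = flag_by_score(listings)
print(flagged)
```

If the original conditions carry different weights, `score_listing` is the only place that would need to change.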
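Flagging fake listings with a regression model makes it possible to compare them against the listings shortlisted by the scoring scheme: which listings the two methods share, and where they differ.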
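One way to make that comparison concrete is with set operations on the listing IDs flagged by each method. The sketch below uses made-up IDs and a hypothetical helper, `compare_flags`; it is not taken from the original analysis.

```python
def compare_flags(score_flagged, model_flagged):
    """Split flagged listing IDs into shared and method-specific groups."""
    score_set = set(score_flagged)
    model_set = set(model_flagged)
    return {
        "both": sorted(score_set & model_set),
        "score_only": sorted(score_set - model_set),
        "model_only": sorted(model_set - score_set),
    }


# Made-up listing IDs, for illustration only.
score_flagged = [11, 12, 13, 14]
model_flagged = [13, 14, 15]

comparison = compare_flags(score_flagged, model_flagged)
for group, ids in comparison.items():
    print(f"{group}: {ids}")
```

The `both` group corresponds to the listings common to the two results in step 3; the other two groups show what each method catches on its own.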
RidgeCV(alphas=array([1.00000000e-04, 3.59381366e-04, 1.29154967e-03, 4.64158883e-03,
1.66810054e-02, 5.99484250e-02, 2.15443469e-01, 7.74263683e-01,
2.78255940e+00, 1.00000000e+01]),
cv=5, scoring='neg_mean_squared_error')
Explanatory power (R²): 0.633
Best α (alpha): 10
▶ Total samples: 662
▶ Fake listings: 19 (2.9%)
| | Neighborhood | SalePrice | predicted | residual |
|---|---|---|---|---|
| 309 | Edwards | 184750 | 341478.416904 | -156728.416904 |
| 2204 | OldTown | 90000 | 165060.186265 | -75060.186265 |
| 469 | OldTown | 122000 | 196637.789395 | -74637.789395 |
| 1909 | OldTown | 97500 | 165711.760154 | -68211.760154 |
| 740 | IDOTRR | 40000 | 100640.665189 | -60640.665189 |
| 116 | OldTown | 159500 | 219067.388226 | -59567.388226 |
| 1214 | OldTown | 107500 | 165796.538413 | -58296.538413 |
| 254 | OldTown | 133900 | 187704.330474 | -53804.330474 |
| 1436 | OldTown | 106000 | 158650.035771 | -52650.035771 |
| 677 | OldTown | 103500 | 155031.742968 | -51531.742968 |
| 374 | BrkSide | 106900 | 158205.245880 | -51305.245880 |
| 1225 | OldTown | 117000 | 165247.245319 | -48247.245319 |
| 2277 | IDOTRR | 123000 | 171019.817386 | -48019.817386 |
| 2025 | OldTown | 117500 | 163763.677821 | -46263.677821 |
| 205 | IDOTRR | 50000 | 95575.542686 | -45575.542686 |
| 528 | IDOTRR | 89500 | 130977.645762 | -41477.645762 |
| 1064 | OldTown | 64500 | 105347.680388 | -40847.680388 |
| 427 | OldTown | 12789 | 53436.186714 | -40647.186714 |
| 22 | MeadowV | 98000 | 137418.527580 | -39418.527580 |
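The output above is consistent with a workflow along the following lines. This is a hedged sketch on synthetic data, not the notebook's actual code. The alpha grid matches `np.logspace(-4, 1, 10)`, and `cv=5` and `scoring='neg_mean_squared_error'` come from the printed model. The features, the synthetic prices, and in particular the flagging rule (a residual more than two standard deviations below the mean residual) are assumptions; the text does not say which cutoff produced the flagged listings.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic stand-in data: three numeric features and a price (assumption).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
coef = np.array([30_000.0, 15_000.0, -10_000.0])
y = 150_000 + X @ coef + rng.normal(scale=20_000, size=n)
y[:5] -= 150_000  # plant a few heavily underpriced listings

# Same alpha grid, fold count and scoring as in the printed model.
alphas = np.logspace(-4, 1, 10)
model = RidgeCV(alphas=alphas, cv=5, scoring="neg_mean_squared_error")
model.fit(X, y)

predicted = model.predict(X)
residual = y - predicted  # negative: listed below the predicted price

print(f"R^2: {model.score(X, y):.3f}")
print(f"best alpha: {model.alpha_}")

# Assumed flagging rule: residual far below the typical residual.
threshold = residual.mean() - 2 * residual.std()
flagged = np.where(residual < threshold)[0]
print(f"flagged {len(flagged)} of {n} listings ({100 * len(flagged) / n:.1f}%)")
```

The residual is actual price minus predicted price, which matches the sign convention in the table: the most negative residuals are the listings priced furthest below what the model expects.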
RidgeCV(alphas=array([1.00000000e-04, 3.59381366e-04, 1.29154967e-03, 4.64158883e-03,
1.66810054e-02, 5.99484250e-02, 2.15443469e-01, 7.74263683e-01,
2.78255940e+00, 1.00000000e+01]),
cv=5, scoring='neg_mean_squared_error')
Explanatory power (R²): 0.746
Best α (alpha): 0.0001
▶ Total samples: 1464
▶ Fake listings: 41 (2.8%)
| | Neighborhood | SalePrice | predicted | residual |
|---|---|---|---|---|
| 180 | NWAmes | 82500 | 186512.846249 | -104012.846249 |
| 997 | NAmes | 84900 | 164088.769037 | -79188.769037 |
| 1262 | Sawyer | 112000 | 189448.582663 | -77448.582663 |
| 1703 | Gilbert | 164000 | 237690.994626 | -73690.994626 |
| 607 | SawyerW | 131000 | 203319.538648 | -72319.538648 |
| 232 | NAmes | 97500 | 165959.754915 | -68459.754915 |
| 748 | NWAmes | 154000 | 222452.575310 | -68452.575310 |
| 777 | NAmes | 140000 | 207071.732788 | -67071.732788 |
| 2207 | Sawyer | 158000 | 222634.022503 | -64634.022503 |
| 1735 | NAmes | 180000 | 239892.023341 | -59892.023341 |
| 1777 | Sawyer | 130500 | 189388.251889 | -58888.251889 |
| 328 | NAmes | 100000 | 157610.279322 | -57610.279322 |
| 1592 | NAmes | 152500 | 209835.479856 | -57335.479856 |
| 1973 | NAmes | 110000 | 167110.808559 | -57110.808559 |
| 2399 | Crawfor | 135000 | 191693.656661 | -56693.656661 |
| 2478 | Crawfor | 149000 | 205373.529713 | -56373.529713 |
| 1392 | Mitchel | 115000 | 170705.874780 | -55705.874780 |
| 379 | NAmes | 104900 | 160448.835040 | -55548.835040 |
| 2165 | Crawfor | 137000 | 192003.206946 | -55003.206946 |
| 1085 | NAmes | 132000 | 184829.510696 | -52829.510696 |
| 1790 | NAmes | 139000 | 191731.983243 | -52731.983243 |
| 445 | Sawyer | 112000 | 164354.269125 | -52354.269125 |
| 289 | NAmes | 167000 | 219307.354724 | -52307.354724 |
| 2293 | ClearCr | 148400 | 200423.025348 | -52023.025348 |
| 79 | SawyerW | 67500 | 119207.948646 | -51707.948646 |
| 2044 | NAmes | 133000 | 184566.013933 | -51566.013933 |
| 1259 | NWAmes | 170000 | 219375.681066 | -49375.681066 |
| 1533 | Sawyer | 62383 | 111275.071760 | -48892.071760 |
| 1955 | Crawfor | 191000 | 239871.517587 | -48871.517587 |
| 1557 | SawyerW | 138500 | 186704.367535 | -48204.367535 |
| 478 | Sawyer | 119500 | 166866.155117 | -47366.155117 |
| 2113 | Crawfor | 145000 | 190775.062653 | -45775.062653 |
| 657 | NPkVill | 123000 | 168657.234940 | -45657.234940 |
| 2085 | Gilbert | 115000 | 159734.356699 | -44734.356699 |
| 112 | NWAmes | 185000 | 229329.132298 | -44329.132298 |
| 1276 | NAmes | 143000 | 187007.097643 | -44007.097643 |
| 2397 | NAmes | 242000 | 285697.477460 | -43697.477460 |
| 661 | NAmes | 135000 | 178659.552930 | -43659.552930 |
| 585 | Mitchel | 160000 | 203299.251232 | -43299.251232 |
| 793 | Sawyer | 121500 | 164707.980590 | -43207.980590 |
| 1752 | Blueste | 121000 | 164024.916824 | -43024.916824 |
RidgeCV(alphas=array([1.00000000e-04, 3.59381366e-04, 1.29154967e-03, 4.64158883e-03,
1.66810054e-02, 5.99484250e-02, 2.15443469e-01, 7.74263683e-01,
2.78255940e+00, 1.00000000e+01]),
cv=5, scoring='neg_mean_squared_error')
Explanatory power (R²): 0.730
Best α (alpha): 0.0001
▶ Total samples: 453
▶ Fake listings: 13 (2.9%)
| | Neighborhood | SalePrice | predicted | residual |
|---|---|---|---|---|
| 275 | Veenker | 150000 | 377583.040255 | -227583.040255 |
| 1008 | Timber | 204000 | 331268.591361 | -127268.591361 |
| 1686 | NoRidge | 285000 | 383469.594440 | -98469.594440 |
| 111 | Somerst | 172500 | 267553.768343 | -95053.768343 |
| 1310 | StoneBr | 270000 | 357997.110005 | -87997.110005 |
| 300 | Somerst | 280750 | 363476.267632 | -82726.267632 |
| 1398 | Timber | 202900 | 281247.127845 | -78347.127845 |
| 1278 | Somerst | 170000 | 248322.760527 | -78322.760527 |
| 1495 | NoRidge | 248000 | 324732.678440 | -76732.678440 |
| 949 | NoRidge | 290000 | 364891.692276 | -74891.692276 |
| 51 | Somerst | 193800 | 268322.417291 | -74522.417291 |
| 1411 | StoneBr | 130000 | 204137.130175 | -74137.130175 |
| 2134 | Somerst | 345000 | 416042.158581 | -71042.158581 |
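As a quick arithmetic check, the flagged shares reported for the three runs can be recomputed from the sample and flagged counts printed above. The combined figure assumes the three runs cover disjoint sets of listings, which the text does not state.

```python
# (total samples, flagged listings) as reported for each of the three runs
runs = [(662, 19), (1464, 41), (453, 13)]

# Share of flagged listings per run, in percent, rounded to one decimal.
shares = [round(100 * flagged / samples, 1) for samples, flagged in runs]
print(shares)

# Combined totals, assuming the runs do not overlap.
total_samples = sum(samples for samples, _ in runs)
total_flagged = sum(flagged for _, flagged in runs)
overall_share = round(100 * total_flagged / total_samples, 1)
print(total_samples, total_flagged, overall_share)
```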